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ABSTRACT 

f^ . The wide-area imaging surveys with the Herschel Space Observatory at sub-mm wavelengths have 

^SJ ' now resulted in catalogs of order one hundred thousand dusty, star-burst galaxies. These galaxies 

J, I capture an important phase of galaxy formation and evolution, but unfortunately the redshift distri- 

^ |, bution of these galaxies, N(z), is still mostly uncertain due to limitations associated with counterpart 

.^ • identification at optical wavelengths and spectroscopic follow-up. We make a statistical estimate of 

' ^(z) using a clustering analysis of sub-mm galaxies detected at each of 250, 350 and 500 /zm from 

the Herschel Multi-tiered Extragalactic Survey (HerMES) centered on the Bootes field. We cross- 
Cn ' correlate Herschel galaxies against galaxy samples at optical and near-IR wavelengths from the Sloan 

Digital Sky Survey (SDSS), the NOAO Deep Wide Field Survey (NDWFS) and the Spitzer Beep Wide 
Field Survey (SDWFS). We create optical and near-IR galaxy samples based on their photometric or 
spectroscopic redshift distributions and test the accuracy of those redshift distributions with similar 
galaxy samples defined with catalogs from the Cosmological Evolution Survey (COSMOS), which has 
superior spectroscopic coverage. We model the clustering auto- and cross-correlations of Herschel and 
optical/IR galaxy samples to estimate N{z) and clustering bias factors. The 5*350 > 20mJy galaxies 

have a bias factor varying with redshift as b{z) = 1.0lQg(l + z)^'^-"^. This bias and the redshift 
dependence is broadly in agreement with galaxies that occupy dark matter halos of mass in the range 
of 10^^ to 10^^ Mq. We find that galaxy selections in all three SPIRE bands share a similar average 
redshift, with (z) = 1.8±0.2 for 250 /im selected samples, and (z) = 1.9±0.2 for both 350 and 500 /im 
samples, while their distributions behave differently. For 250 /im selected galaxies we find the a larger 
number of sources with z < 1 when compared with the subsequent two SPIRE bands, with 350 and 

C^ ', 500 /zm selected SPIRE samples having peaks in N{z) at progressively higher redshifts. We compare 

^ ■ our clustering-based N{z) results to sub-mm galaxy model predictions in the literature, and with an 

CO ' estimate of A^(z) using a stacking analysis of COSMOS 24 //m detections. 

2t . Subject headings: submillimeter: galaxies — Galaxies: evolution — Galaxies: high-redshift 
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1. INTRODUCTION 

The properties of the dusty, star-forming galaxies de- 
tected by the Speetral and Photometric Imaging Receiver 
(SPIRE; Griffin et aL 2010) aboard the Herschel Space 
Observatory^ (Pilbratt et al. 2010) at sub-mm wave- 
lengths provide important clues to the nature of dusty 
star-formation and the role of galaxy mergers in trig- 
gering such star-formation in distant galaxies. However, 
the redshift distribution of these galaxies has yet to be 
determined observationally. The low spatial resolution 
of Herschel-SPIRE observations complicate the identi- 
fication of counterparts at optical and near-IR wave- 
lengths. Moreover, the optical emission from these star- 
bursting galaxies is highly extinct and could potentially 
bias optical spectroscopy observations to low-redshift 
bright galaxies. Instead of optical or IR spectroscopy, 
mm and sub-mm wave spectroscopy can be pursued tar- 
geting fine-structure and molecular lines such as CO and 
[CII]. Such measurements, unfortunately, are currently 
limited to a handful of the brightest sources - mostly the 
rarely lensed sub-mm galaxies (e.g., Lupu et al. 2010; 
Scott et al. 2011; Riechers et al. 2011; Harris et al. 
2012), as existing instrumental capabilities do not allow 
large CO or [CII] surveys of typical sub-mm galaxies. 

There have been a few other approaches to obtain the 
N{z) of sources at these wavelengths. A statistical ap- 
proach based on photometry alone, using SPIRE colors, 
was considered in Amblard et al. (2010; see also Lapi et 
al. 2011), but such techniques are subject to uncertain- 
ties on the assumed spectral energy distribution (SED) 
of the galaxies at sub-mm wavelengths. These gener- 
ally involve isothermal SED models, where the redshift 
distribution is degenerate with the assu med dust tem- 
perat ure distribution. Marsden et al. (jMarsden et al.l 
120091 ) employed stacking methods to effectively measure 
the CIB as a function o f redshift, and Bethermin et al. 
(jBethermin et al. 1 12012[) have recently measured deep 
source counts as a function of redshift, also via stack- 
ing, which is compared to with results of this paper. 

Here we pursue a second statistical approach to mea- 
sure the SPIRE galaxy redshift distribution using the 
spatial clustering of the sub-mm population relative to 
clustering of galaxies with an a priori known redshift dis- 
tribution (Schneider et al. 2006; Newman 2008; Zhang 
et al. 2010). The unknown sub-mm redshift distribution 
can be estimated via the strength of its cross-correlation 
relative to galaxy samples of known redshifts. Modeling 
also requires that the clustering bias factors of all galaxies 
be determined jointly through a combination of auto and 
cross-correlation functions. The key advantage of this 
technique is that it does not require cross-identification 
of SPIRE sources in optical and IR catalogs. 

For this study we make use of data from the Herschel 
Multi-tiered Extragalactic Survey (HerMES; Oliver et 
al. 2012), which mapped a large number of well-known 
fields with existing multi- wavelength ancillary data using 
SPIRE. To cross-correlate against SPIRE-selected galax- 
ies, we make use of near-IR selected galaxy samples from 
Spiizer observations based on the 1.6 //m "bump", which 
has long been established as a redshift indicator (Saw- 

^'^ Herschel is an ESA space observatory with scienee instru- 
ments provided by European-led Principal Investigator consortia 
and with important participation from NASA. 



icki 2002; Simpson & Eisenhardt 1999; Wright & Fazio 
1994). The bump results from the fact that the H~ ab- 
sorption in stellar atmospheres is minimally opaque at 
1.6 yum. This leads to a bump in th e spectral e nergy 
distributions of cool stars at 1.6 ^m (| JohnI [19881 ) that 
is nearly ubiquitous in galaxy spectra. For z > 0, the 
wavelength at which the bump in the SED peaks allows 
for a redshift determination based on the colors in IRAC 
channels between 3.6 and 8 /xm, and covering the redshift 
range of 1 to 2.5. We complement these "bump" galaxy 
samples with a 24 ^m and an i?-band based sample of 
dust obscured galaxies, which has a redshift distribution 
that peaks around z ^ 2.3 (Dey et al. 2008). We also 
make use of optical-selected galaxy samples with SDSS 
spectroscopic and photometric redshifts out to about 0.7. 
The paper is organized as follows. In Section 2 we de- 
scribe source selection in SPIRE and galaxy sample se- 
lection with IRAC and MIPS bands, complemented with 
optical data to remove outliers. In Section 3 we describe 
the redshift distribution of the galaxy samples used for 
the cross-correlation analysis, and in Section 4 we de- 
scribe the cross-clustering measurements. Fitting results 
are presented in Section 5. In Section 6 we present N(z) 
and b{z), and discuss our results. We assume a flat- 
ACDM cosmological model and fix the cosmological pa- 
rameters to the best-fit values of ll ni = 0.27, fit = 0.046 , 
as = 0.81, Hs = 0.96 and h = 0.71 (jKomatsu et al.ll2011f) 
when performing MCMC model fits. 

2. SPIRE SOURCE AND GALAXY SAMPLE SELECTION 
2.1. Herschel-SPIRE sample 

The HerMES SPIRE source catalogs used for this pa- 
per come from a combined analysis involving both a di- 
rect source extraction and an attempt to account for 
blending at 350 and 500 fim wavelengths given the po- 
sitions of 250 ^m detections (Wang et al. in prep). The 
method updates the source extraction pursued by Her- 
MES at each of the three SPIRE bands independently 
that ignored issues associated with blending at longer 
wavelengths (Smith et al. 2011). 

In order to maximize the overlap with multi- 
wavelength data, we concentrate our study on the Bootes 
field with HerMES SPIRE data covering 12.5 deg^. The 
field has been imaged with Spitzer IRAC as part of 
the Spitzer Deep Wide Field Survey (SDWFS; Ashby et 
al. 2009) and from the ground with optical to near-IR 
observations as part of the NOAO Deep Wide Field Sur- 
vey (NDWFS; Jannuzi and Dey 1999), with coverage also 
provided by the Sloan Digital Sky Survey (SDSS; Abaza- 
jianet al. 2009). 

For this study we selected SPIRE sources with a fiux 
density greater than 20 mJy in the Bootes field. We 
find in excess of 22 000 sources in each of the SPIRE 
bands covering the entire 12.5 deg'^ of SPIRE observa- 
tions, while 3775, 3243 and 958 galaxies at 250 ^m, 350 
/.jm and 500 /im, respectively, were used in the cross- 
correlation study - an area covering 6.7 deg^, where var- 
ious ancillary data best overlap. 

At 20 mJy, the SPIRE catalogs are ~ 30% complete at 
each of the three wavelengths. The 90% completeness of 
the catalogs is at a flux density of about 55 mJy (Wang et 
al. in prep). At such a high flux density level, the number 
of SPIRE sources in the area overlapping with ancillary 



data is down by at least factor of 8 and the resulting low 
surface density does not allow useful constraints on the 
redshift distribution. We note some caution in interpret- 
ing our results with models due to the incompleteness. 
Wc are unable to correct for it through simulations due 
to the lack of a priori information on the redshift distri- 
bution of missing sources. It is unlikely, however, that 
the redshift distributions presented here are biased due 
to catalog incompleteness since the source detection al- 
gorithm is primarily based on the flux density and not 
the individual redshifts of SPIRE sources. 

2.2. IRAC Sample Selection and Star-Galaxy 
Separation 

Using the SDWFS data combined with ground-based 
7f-band data from NDWFS, we generated three differ- 
ent catalogs of 1.6 /xm-bump sources based on the IRAC 
channel where the SED peaks. These three samples are 
as follows: bunip-l with a peak in the 3.6 /im channel 
(0.5 ^ z < 1.5); bump-2 peaking in the 4.5 /zm chan- 
nel (0.8 < z < 2.2); and bump-3 with a peak at 5.8 /xm 
(1.5 '^ z < 3.0). Using the photometric redshifts com- 
puted via a template fitting method (Csabai et al. 2003) 
in the SDSS DR7 catalog, we also constructed two sep- 
arate redshift distributions with peaks at z ^ 0.3 and 
0.7. 

In order to establish catalogs of bump-1 to bump-3 
galaxy populations we first had to remove stars and other 
contaminants from our optical and IR catalogs. This was 
done using a combination of infrared and optical data. 
We used the SDWFS four-epoch stacked catalog (Ashby 
et al. 2009) which contains all sources detected in the 
first channel of IRAC at or above 5a. This catalog was 
matched with the NDWFS third data release catalog and 
the SDSS catalog, using a 2.5" matching radius. For 
sources with multiple matches (<3%), magnitudes from 
the NDWFS catalog were then compared with the 3.6 /im 
magnitude and entries with the most similar values were 
kept. 

Stars and spurious sources were removed from the re- 
sulting merged catalog using various techniques. Vega 
magnitudes and 6" diameter aperture photometry are 
used throughout the star-galaxy separation unless oth- 
erwise noted. We employed a three stage process to 
remove stars from our catalog. An initial selection of 
sources with [3.6] < 16 were identified as stars, where 
[3.6] represents the vega magnitude at 3.6 /im. Us- 
ing the combination of optical and IRAC photometry 
wc further classify sources as stars that either satisfy 
(Bw - /) > 2(/ - [ 3.6]) - 1.65, or -1.65 > (B^ - I) - 
2(/-[3.6]) > -3.35 (jEisenhardt et al.ll2004l) The former 
criterion defines a sequence of BIK stars (iHuang et al. 

mand the latter a seque nce of giant stars (jJohnson 
: lBesseUfc Bret^ ()1988l )). Lastly, for IRAC sources 
without optical counterparts, we used a binning method 
that involved only the IRAC bands (jWaddington et ahl 
|2007[ ). Three fiux density bins were defined with the cri- 
teria [3.6] < 19.5, 19.5 < [4.5] < 20.0 and [4.5] < 23, with 
color cuts [3.6] - [4.5] < -0.35, -0.30 and -0.25, respec- 
tively; all sources satisfying these criteria were assumed 
to be stars. The results of these extractions are shown 
in Fig. m 

2.3. Bump, DOG and SDSS Selections 
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Fig. 1. — Color-color (top; _Bw — / vs. / — [3.6]) and color- 
magnitude (bottom; [3.6] — [4.5] vs. [4.5]) density plots of sources 
(both stars and galaxies) in our Bootes field catalogs. Red stars 
indicate the 3.6 fim magnitude selection for stars, while giants are 
depicted as yellowr stars, and the BIK sequence as blue stars (see 
Section 2.2 for details). Only a fraction of the sources identified as 
stars are plotted to avoid over-crowding in the plots. Black points 
are the galaxies. 

From our resulting merged and star-subtracted cata- 
log, we invoked simple color constraints to classify three 
different types of bump sources and dust-obscured galax- 
ies (DOGs; Dey et al. 2008), using 4" aperture diameter 
photometry. Bump-1, bump-2 and bump-3 sources each 
display excess emission in IRAC channels 1, 2 and 3, 
respectively. Bump-l sources were selected using the cri- 
teria K - [3.6] > 0.1 and [3.6] - [4.5] < 0; bump-2 with 
X- [3.6] > 0, [3.6] - [4.5] > and [4.5] - [5.8] < 0; bump- 
3 with [3.6] - [4.5] > 0, [4.5] - [5.8] > and [5.8] - [8] < 0. 
The number of bump-1 sources in the SDWFS catalogs 
were found to be ^ 1.3 xlO'' at the 5cr detection limit 
of the 3.6 /im channel of the IRAC instrument. Bump-2 
source identification yielded 6.5 x 10^ galaxies, while the 
bump-3 catalog contains 4 x 10^ galaxies. 

We also make use of a sample of dust obscured galaxies, 
selected with 24 /tm Spitzer-MIPS and optical i?-band 
data to have extreme red colors from dust obstruction, 
with S24/Si<: > 1000 (where S24 is the 24 /im flux den- 
sity), or equivalently R — [24] > 14 and S24 > 0.3 mJy 
(w 6cr; Dey 2009). We found that a total of 2838 galax- 
ies satisfied the selection criteria. Based on spectroscopic 



follow-up, they a re now kno wn to have a mean redshift 
around z - 2.3 (|Devl 12009V We make use of the fuU, 
broad redshift distribution for this sample, spanning the 
range of 0.5 < z < 3.5, with a peak around z ~ 2, found 
from a similar identification of DOGs in the COSMOS 
field (see Section 3 and Fig. 2) for the present analy- 
sis. These dust-obscured galaxies have been suggested 
to be an intermediary phase of the e volution o f quasi- 
stellar objects from gas-rich mergers (jDevI |2009() . They 
have also been shown to be strongly clustered and are 
believed to be progenitors of massiv e (3 — 6L,) galaxies 
at low redshift (jBrowdin et al.ll2008| ). 

Finally, to cover the redshift range of < z < 0.7 
efficiently we also selected optical galaxies from SDSS. 
These sources have photometric redshifts, individually 
determined with SED fits to SDSS photometry, in the 
above range. We make use of 8 000 SDSS galaxies and 
we consider two sub-samples peaking at z ^ 0.2 and 
0.5 with roughly equal numbers. The first of these two 
sub-samples is obtained by selecting sources which obey 
2.6 < B - / < 3 and -0.8 < I - R < 0.1, while the 
second selection obeys B — I > A and —0.9 < I — R < 0. 
These six galaxy samples (3 bump catalogs, DOGs, and 
two SDSS samples) provide adequate redshift coverage 
over the range of < z < 3. 

3. COSMOS PHOTO-Z AND SPEC-Z 

While we are able to generate large samples of galax- 
ies to cross-correlate against the SPIRE catalogs of the 
Bootes field, the existing spectroscopic and photometric 
redshift information in the Bootes field is not adequate to 
establish the redshift distributions of the Spitzer galaxy 
samples. For that we turn to data in the Cosmological 
Evolution Survey (COSMOS; Scoville et al. 2006; Capak 
et al. 2007) where we can select similar galaxy samples 
as in the Bootes field, using the same depth and color 
criteria. For those galaxies we are able to use either the 
existing spectroscopic or the photometric redsh ifts from 
the pubhc COSMOS catalog (lllbert et al.l[200l . We as- 
sume that the redshift distributions for galaxy samples 
in COSMOS is the same as for Bootes when interpret- 
ing clustering measurements from the wider Bootes area 
that overlaps with SPIRE. 

The COSMOS field has extensive photometric red- 
shift measurements over 2 deg^ using 30 broad, in- 
termediate and narrow band filters from space-based 
telescopes {Hubble, Spitzer, GALEX, XMM, Chandra) 
and ground-based telescopes (Subaru, VLA, ESO-VLT, 
UKIRT, NOAO, CFHT, and others). We used a public 
COSMOS source catalog containing ~ 10^ spectroscopic 
redshifts, and ~ 3 x 10^ photometric redshifts that were 
computed using a x^ galaxy template-fitting technique 
(jllbert et al.l [20091 ). In both the Bootes and COSMOS 
fields, we imposed the same brightness thresholds in the 
selection bands: for IRAC channel 1 and K-h&nd we im- 
posed a 5ct detection limit. We require that each of the 
galaxies detected at above 5a is also detected in IRAC 
Channcel 2 to 4. In those cases, however, we considered 
a source that has a flux density above 50% completeness 
level to be considered as detected, while sources with 
flux densities below 50% completeness level of each of 
the three channels were dropped from the flnal catalog. 
This selection process was chosen to ensure that we arc 
probing equal depths between the two fields, COSMOS 
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Fig. 2. — N{z) distribution obtained from SDSS photometric 
redshifts in the Bootes field as well as buinp-l to -3 (top plot) 
and DOGs (bottom plot) redshift distributions from the COSMOS 
field. We assume the latter four redshift distributions measured 
directly in COSMOS are also applicable for galaxy samples under 
the same color selection criteria in the Bootes field. 

and Bootes. 

Using the same selection methods described above, we 
were able to obtain N(z) measurements for each of our 
different source classifications from the COSMOS source 
catalog (see Fig. [2). Note that the galaxy type selec- 
tions are mutually exclusive, so there are no overlap- 
ping sources between different samples. All of the bump 
sources have well defined redshift distributions, and the 
DOG distribution agrees well with those in the literature 
(see figure 1 in Dey 2009). We identified 384 sources as 
DOGs in COSMOS, with 5*24 > 0.3 /.tJy, and 683 with 
5*24 > 0.15 /.tJy, a number density consist ent with statis- 
tics o f the DOG population in other fields (jBrowdin et al.l 
[20081) . 

4. ANGULAR CROSS-CORRELATION AND COVARIANCE 
MATRIX 

To obtain the redshift distributions of SPIRE sources, 
we first cross-correlate the SDSS-selected sample and 
bumps and DOGs from the Bootes field, against sub- 
mm sources in each SPIRE band, from arcminute to de- 
gree angular scales. We also measure the auto-correlation 
functions of the galaxy and SPIRE samples, as these are 
needed to model the clustering strength and to extract 



the unknown rcdshift distribution. 

We use a bootstrap method to estabhsh the covari- 
ance matrix for each of the cross- or auto-correlation 
functions, as an accounting of the covariance is needed 
to properly model the clustering measurements. We do 
this by selecting 200 separate catalogs from the original 
SPIRE data by removing about 5% of the sources ran- 
domly. We measure the auto and cross-correlations with 
each of those catalogs and build the mean auto and cross- 
correlation functions, the variance from the scatter, and 
the covariance from the correlations between the mea- 
sured auto and cross-correlation functions. 

The angular cross-correlation function is modeled ana- 
lytically using the COSMOS rcdshift distribution of the 
bump-1, bump-2, bump-3 and DOGs, while for the SDSS 
galaxy samples we make use of the public photometric 
redshifts from SDSS DR-7. For simplicity we bin the un- 
known SPIRE rcdshift distribution from z = to 4 in 5 
bins in redshift. To extract the best- fit values and uncer- 
tainties in the redshift distribution bins, and the other 
parameters in the analytical model, we make use of a 
likelihood fitting technique based on the Markov Chain 
Monte Carlo (MCMC) method. 

In this section, we first discuss the method of model- 
ing the angular cross-correlation Across using the redshift 
distribution of galaxies and the linear matter power spec- 
trum. Then we describe the measurement of the Wcross 
functions as well as the covariance matrix from the galaxy 
samples. 

4.1. Modeling the angular cross-correlation 

The angular cross-correlation function Wcross for two 
galaxy samples is defined by 



lycross(^) = {Sni{$)dn2{4>')), 



(1) 



where dni((f>) = {ni{(j)) — ni)/ni, and ni{(j)) is the number 
density of galaxies observed in direction in the sky 
(0 = 0—0'), and fii is the mean number density of the 
galaxy sample i. Sui can be decomposed into two terms- 
one term from the real clustering of galaxies and a second 
term caused by lensing magnification. Here we ignore the 
few percent contribution from lensing (jWang et al.ll20lil ) 
and only consider the clustering term, which is 
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where 6i and &2, -^i(x) ^-nd A^2(x) ^-re the galaxy bias 
and the normalized radial distribution for the two galaxy 
samples, respectively. P{Xi k) is the power spectrum 
of the dark matter, Joix) = sin(a;)/a: is the zero-order 
Bessel function, and x ^^d r(x) ^re the radial comoving 
distance and the comoving angular diameter distance re- 
spectively {r{x) = X in flf^t space), xh denotes the ra- 
dial distance to the horizon, or Hubble length. Note that 
Wgg^O) will be zero if the positions of the two galaxy sam- 
ples do not overlap with eachother. 

When modeling the measurements, we make use of the 
linear theory power spectrum to describe P(x, k) and fo- 
cus only on modeling the measurements over the angular 



scales of 6', and above where the clustering is in the lin- 
ear regime (Cooray et al. 2000). At these large angular 
scales, the 1-halo term makes less than a 1% correction 
to the correlation function and can be safely ignored. 

4.2. The measurement and covariance matrix of the 
angular cross-correlation 

The angular cross-correlation function Wctoss{0) is de- 
fined as the fractional excess of the probab ility rela- 
tive to a random distribution (jPeeblesI 119801 ) . and can 
be measured from galaxy samples by the pair counts 
method. There are several kinds of estimators that are 
proposed to measure the cross-correlation (e.g. Blake 
et al. 2006); the one we adopt here is the modified 
Landy-Szal ay estimator which is d erived from the auto- 
correlation (jLandv fc SzalavlFlQQSD . 
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D1D2 — D1R2 — D2R1 + R1R2 
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where DiD2{9), DiR2{e), D2Ri{9) and RiR2{0) are the 
normalized pair counts for data (Di) and random (i?i) 
catalogs with separation 6. 

We generate random un-clustcred catalogs with vary- 
ing catalog sizes that contain 5 to 10 times more sources 
than the observed samples, with a larger number of 
sources than in data catalogs to avoid biases coming from 
Poisson fiuctuations. The angular cross-correlation ex- 
tracted from the observational data, and the theoretical 
estimation using the best-fit value (see next section) of 
the SPIRE distribution in the Bootes field are shown in 
FigEl The auto- and cross-correlation of the SPIRE sur- 
veys for 250 /im, 350 /im and 500 /im are also shown in 
Fig.H 

As was mentioned in the previous Section, to avoid 
biases coming from non-linear clustering we only use 
w{9) data from 0.1 to 1° to fit the model, since adding 
the 1-halo term with three or four extra parameters for 
the halo occupation number will result in extra degen- 
eracies, degrading the N{z) estimates, consistent with 
theoretical suggestions in the literature (e.g., Neyrinck 
et al. 2006). Also keeping to scales larger than 0.1°, 
we avoid the need to introduce a transfer function for 
w{9) for SPIRE sources and their cross-correlations since 
at smallest scales close to the SPIRE point response 
function, clustering is expected to be affected by source 
blending and issues related to map making. As stud- 
ied in Cooray et al. (2010), at 6* > 0.05°, there are no 
corrections to the measured w(6). 

To evaluate the covariance matrix of the angular corre- 
lation w{9), we use a bootstrap method to generate 200 
realizations for the galaxy samples. Then the covariance 
matrix of lOcross is 

1 ^ 



^tj 



k 



where N = 200 is the number of the bootstrap realiza- 
tion, and w{9) is the average angular correlation for all 
bootstrap realizations at 6. The error of th e an gular cor- 
relation thus takes the form of cFw{()i) ~ VCu- 

We use nine logarithmic bins from 0.01 to 1° to cal- 
culate the angular auto- and cross-correlation and their 
covariance matrix. The model correlation and cross- 



6 



correlation functions, w"\ arc calculated for a given N{z) 
and clustering bias factors (described in the next Sec- 
tion), and are compared with measurements. 



..data 



us- 



ing the covariance matrix from the data. In calculating 
w , we make use of the measured N{z) of the SDSS, 
bumps and DOGs, derived in the last section. 

5. ESTIMATING THE SPIRE GALAXY REDSHIFT 
DISTRIBUTION 

We employ a Markov Chain Monte Carlo (MCMC) 
technique , usin g the Metropolis-Hastings algorithm 
(jHastingsl [19701 ) ■ to fit the SPIRE redshift distribution 
N{z) of sources with flux densities greater than 20 mJy 
at each of the three wavelengths. We follow established 
standard procedures in fitting the data, including thin- 
ning of the chains and separation of steps that are part 
of the initial burn-in period. 

We describe the unknown redshift distribution N{z) at 
five values, using five "pivot" redshifts Zp = 0.1,0.5, 1,2 
and 3, and set N{zp = 0) = and N{zp = 4) = to 
describe the two end points of the redshift distribution. 
To describe N{z) when < z < 4, we linearly interpolate 
the fitted N{z) distribution at each of the pivot redshifts 
Zp and use those linearly interpolated values between two 
pivots in our model fitting algorithm. The assumption 
that N{z > 4) = does not bias our results since we only 
expect at most a few percent of the sub-mm galaxies to 
be located at z > 4 (e.g.. Pope & Chary 2010). Further- 
more, we do not have sensitivity to such high redshifts, 
given that the optical and ncar-IR galaxy samples we 
have used for the cross-correlation study are restricted 
to z < 4. 

Before deciding on this description, we also considered 
a description of N(z) that involved five bins in redshift, 
with N{z) taking the same value in each of the bins. 
However, we failed to obtain fits to the binned case since 
in the first bin < z < zi, N{z) prefers a value that is 
non-zero at zi, but zero at z = 0. The use of pivot red- 
shifts and linear interpolation between pivots avoids the 
discontinuities that were present with the binned case, 
leading to issues with the numerical integrations of the 
clustering in equation 2. 

As discussed earlier (related to equation 2), we also 
need to account for the clustering bias factor of galaxies 
and SPIRE sources relative to the linear matter power 
spectrum. Instead of keeping the bias in each of the 
bins as a free parameter, which leads to a large number 
of model parameters to be determined from the data, 
we assume a model for the galaxy bias, as a function of 
redshift, to be of the form 



b{z) = bo{i + zy 



(5) 



where Bq and c are free parameters to be determined from 
data using the MCMC analysis. In addition to this model 
we also consider two other approaches with: (i) 6(z) = 
6o + ^1^, a simple linear interpolation with redshift; and 
(ii) b{z) = bo when z < 2 and b{z) = bi when z > 2. We 
found results consistent within la uncertainties in both 
N{z) and 6(z) with the power-law form when using the 
linear relation. 

For optical and IR galaxy samples we assume that each 
has an average bias factor, and we do not account for 
the redshift evolution of the bias factor in each of the 
galaxy samples. This is a fair assumption since each 



of the samples we have created has a narrow redshift 
distribution compared to the distribution expected for 
the SPIRE galaxies. 

Altogether we have thirteen free parameters in our 
MCMC fitting, which contains five parameters for the 
SPIRE redshift distribution and six bias parameters for 
SDSS-1, SDSS-2, bump-1, bump-2, bump-3, and DOGs, 
plus two parameters to describe the SPIRE galaxy bias 
and its evolution with redshift. While the redshift distri- 
bution and bias factor and evolution for SPIRE sources 
are different at each of the three SPIRE wavelengths, the 
bias factors for optical and IR-selected galaxies remain 
the same. Thus, the six bias parameters for the galaxy 
samples, with assumed or known redshifts, can be de- 
termined jointly from cross-correlation data at the three 
SPIRE wavelengths together with their auto-correlation 
functions. We fix all the other cosmological parameters 
and assume the fiat ACDM model as mentioned in Sec- 
tion 1. 

We fit the data following the x^ distribution estimated 
as 

X'= Y. A^C-^A, (6) 

datascts 

where A = [u;'i^*^(6'i)-u;*'^(6'i), ..., w'^''^^{99)-w^^i99)], 
C is the covariance matrix of w{9), w^^ is obtained di- 
rectly from the N{z), and "data" here are the full an- 
gular cross-correlations for the SPIRE, SDSS-1, SDSS- 
2, bump-1, bump-2, bump-3 and DOG samples (21 
cross-correlations for each SPIRE band) , and their auto- 
correlations in the Bootes field. The angular auto- 
correlation and the cross-correlation between the SPIRE 
and the SDSSl, SDSS2, bumpl, bump2, bump3 and 
DOG sub-samples extracted from the observational data 
in the Bootes field are shown in Figs. [3] and |4] as exam- 
ples. 

We adopt an adapt i ve step -size Gaussian sampler given 
by iDoran fc M^IeilSl (J200l for the MCMC fitting pro- 
cess. The conver g ence c riterion we take is discussed in 
iGelman fc RubinI (|1992[ ). We generate six chains with 
about 10^ points after the convergence process. At the 
end we resample the chains to get about 10 000 points to 
illustrate the probability distribution of the parameters. 

6. RESULTS AND DISCUSSION 

In Fig. Owe show the best-fit results and the la errors 
of the redshift distribution, N{z), for the three SPIRE 
bands (see also Table 1 for the values). The redshift dis- 
tributions are normalized such that j dzN{z) = 1. The 
Icr error bars (cyan lines) are derived from the Markov 
chains, which are statistically estimated via the values 
of N{z) calculated using each chain point at different 
redshifts. As an example, the 100 best-fit N(z) are also 
shown in yellow dotted lines. As shown by the errors 
of N{z) at high redshift (z > 3), the galaxy distribu- 
tion could be larger when going from the 250 /xm to 
500 fim bands, which implies there may be more high- 
redshift galaxies for the 500 fim band than the 250 /im 
and 350 fim bands. In Table 1 we also tabulate the 
average redshift of the SPIRE sources by calculating 
J dz zN{z), and these values range from 1.8 ± 0.2 at 
250 fim to 1.9 ± 0.2 for 500 ^m. We also derive the corre- 
lation coefficient for the N{z) at five pivot redshifts from 
our Marcov chains (see Appendix). We find the correla- 
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Fig. 3. — Angular cross-correlation between the 250, 350, and 500 //m objects of the SPIRE surveys and the SDSSl, SDSS2, bumpl, 
bunip2, bump3 and DOGs in the Bootes field. The la error bars are derived from 200 bootstrap realizations. The black dashed lines are 
theoretical estimates of the angular cross-correlation using the best-fit value of the SPIRE redshift distribution. 
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Fig. 4. — Angular auto- and cross-correlation for the 250, 350, and 500 ^m SPIRE sources in the Bootes field. The Icr error bars are 
derived from 200 bootstrap realizations. The black dashed lines are the theoretical estimation using the best-fit value of the SPIRE redshift 
distribution. 
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Fig. 5. — Best-fit normalized redshift distributions (red solid line) and la error regions (cyan line) for sources with flux densities greater 
than 20 mJy for 250, 350 and 500 fim SPIRE bands using SDSS, bump-1 to bump-3, and DOGs catalogs in the Bootes field. As examples 
100 best-fit N(z) from our MCMC results are also shown in yellow dotted lines. The analytical model predictions on N{z) from the literature 
(blue dashed line; Bethermine et al. 2010) for galaxies in the three SPIRE bands are also shown for comparison. The green line is a direct 
estimate of N(z) using the combination of a stacking and a cross-identification analysis involving 24 fim MIPS and SPIRE sources from 
Bethermin et al. (2012). 

tion is weak (~10%) between adjacent A'i pivots for each SPIRE band. 



TABLE 1 
The best-fit SPIRE redshift distribution and bias 

PARAMETERS 



N{z) 


2-pivot 


250 ^m 


350 fim 


500 Aim 


Ni 


0.1 


19+"" 


1fi+0-23 


o.oot°:i« 


N2 


0.5 


0.36t«;f, 


0.27t°fe 


0-09tUl 


N3 


1.0 


f, .11 +0.21 


0.30t°Ji 


0.15«;?^ 


N4. 


2.0 




oy+o-i** 


0.39±0;}J 


N5 


3.0 


'J-^'J-0.20 


34+012 


46+°-°'' 


Average Redshift 










(^> 




1.8 ±0.2 


1.9 ±0.2 


1.9 ±0.2 


Sub-mm Bias 



bo 



1.0: 



1.1^ 



1.0 



+ 1.0 
0.5 



0.9 



+0.6 
0.5 



1.2 



+0.3 



l.V 



Two additional N(z) predictions from the literature 
arc also shown in the plot for comparison. The dashed 
line is a direct estimate oi N{z) from PSF-fitted extrac- 
tion using 24 fim positions as a prior (jBethermin at alTI 
|2012( ) and the green curve is a model prediction fo r 
the SPIRE redshift distribution (|Bcthermin et al.ll2010f ). 
Our estimation for N{z) for the 250 yum band agree well 
with both the direct extraction based on 24 fiiji iden- 
tifications and a model prediction, while we find some 
differences at 350 and 500 ^m. However, given the large 
uncertainties in our binned N{z) estimate these differ- 
ences are statistically insignificant. 



TABLE 2 
Bias factors of optical and IR-selected galaxy samples 

Sample Approximate z-Range bias 



SDSS-1 


0- 


0.4 


SDSS-2 


0.3- 


-0.7 


Bump-1 


0.8- 


-1.5 


Bump-2 


1.2- 


-2.0 


Bump-3 


1.6- 


-2.5 


DOGs 


0.7- 


-3.0 



1.6 



+0.2 
0.2 



1.1 



+0.2 



2.0 



+0.3 
0.3 



2.3"* 



2.0 



+0.! 



2.6 



+1.1 
1.9 



estimated bias factor of 3.1 ± 0.5 (Brodwin et al. 2008), 
which can be compared to our estimate of 2.6t.\'l- While 
fully consistent with the Brodwin et al. (2008) estimate, 
our central value is lower than their value, as we account 
for the full redshift distribution of these galaxies, while 
their analysis assumed a redshift of 2 for the whole DOG 
sample in the Bootes field. 

In Fig. [7] we show the 68% confidence contour maps 
of the bias factors of SPIRE sources at the three wave- 
lengths (the values and uncertainties are listed in Ta- 
ble 1). We generally find that the SPIRE galaxy bias 
factors are consistent with b{z) ^ 1 + z (i.e. c« 1). To 
understand further the evolution of the sub-mm galaxy 
bias factor, we plot the redshift dependence in Fig. |S1 
where we compare with the bias factor of dark matter 
halos at several halo masses, from dwarf galaxy mass 
to galaxy cluster scales. The bias factors we find at all 
three wavelengths indicate a halo mass in the range of few 
times 10^° to few times 10" Mg. The SPIRE clustering 
analysis in Cooray et al. (2010) found a halo mass for 
sub-mm galaxies that is about 3x10^^ Mg, under the as- 
sumption of a redshift distribution for the sub-mm galaxy 
population with a peak at ^ ~ 2.3, similar to the DOG 
redshift distribution in Fig. 2. We now find a slightly 
lower bias factor, and this is primarily due to the fact 
that the underlying redshift distribution of the SPIRE 
galaxies, especially at 250 /.tm, contains more sources at 
lower redshifts (z < 1). While the result here is for bright 
sub-mm sources that are individually detected, the model 
interpretation of the SPIRE anisotropy power spectrum 
by Amblard et al. (2011) found a minimum halo mass of 
3x 10" Mq. 

In Fig. [8] we also compare the SPIRE sub-mm 
galaxy bias factors to samples of galaxies and qu asars 
from the literatur e (jShen et all l2007t iRoss et alj 120091 : 
iHickox et al.ll2012h . Our results are generally consistent 
with the possibility that SMGs and quasars trace simi- 
lar evolutionary paths and that the hosts correspond to 
dark matter halos that contain a '^ few L^ ellipticals 
at z ~ 0. The exact mechanism on how the starburst 
galaxies seen in SPIRE feed the black holes that result 
in the quasars, and the subsequent feedback that sup- 
presses star-formation, remains uncertain. 

In Fig. [5] we also plot two models for the evolution 
of the bias factor of merging galaxies from Hopkins et 
al. (2007). While these models have similar behavior at 
z < 3, differences exist at higher redshift. A clustering 
study of SPIRE-selected sub-mm galaxies at 2 > 4 on its 
own, or as a cross-correlation with high-redshift quasars, 
could potentially be used to understand the intricate role 
of starbursts and quasars and to separate the subsequent 
feedback processes. 



In addition to N{z) and bias factors of SPIRE-selected 
galaxies, we also measure the bias factors of the optical 
and IR-selected galaxy samples that we have used for 
cross-correlations. In Fig.[n]we show the two-dimensional 
error plots, and in Table [2] we list the best-fit bias val- 
ues and their uncertainties. These results are obtained 
by combining the likelihoods from the MGMG chains of 
all three SPIRE bands. These values are consistent with 
values quoted in the literature for the bias of these sam- 
ples. For example, the dust-obscured galaxies have an 



7. CONCLUSIONS 

The wide-area sub-mm surveys with the SPIRE instru- 
ment aboard the Herschel Space Observatory have now 
led to catalogs of order one hundred thousand dusty, 
star-forming galaxies at 250, 350, and 500 /im. While 
some properties of this sub-mm source population are 
now understood, the redshift distribution of these galax- 
ies. N{z), is not yet well determined observationally. We 
make a statistical estimate of N{z) using a clustering 
analysis involving the cross-correlation of sub-mm galax- 
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Fig. 6. — The one and two-dimensional probability distribution functions for bias parameters for all of the Bootes samples used throughout 
this paper. These bias parameters are estimated by combining the likelihoods from the MCMC chains of all three SPIRE bands. The 
68.3%, 95.5% and 99.7% uncertainties from the fits are shown in the two-dimensional error plots. 
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Fig. 7. — The 68% contour maps of the bias parameters bo and 
c, determined from the MCMC analysis, at 250, 350 and 500 fira 
with S > 20 mjy in the Bootes field. 
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Fig. 8. — Clustering bias of Sgso > 20 mJy SPIRE sourecs as a 
function of rcdshift. The shaded region shows the 68% confidence 
region allowed, with the blue solid line showing the best-fit b{z) 
relation. For reference we plot the bias factor of dark matter halos 
as a function of halo mass. The range allowed by the data over 
< z < 4 is occupied by halos with mass 10 < log M/Mq < 14. 
We also show samples of galaxies and quasars from the literature 
aijhen et al] [200?: Tlo ss et al.l [20091 : IHickox et aLllMig ). and two 
models for the evolution of the bias factor of merging galaxies from 
IHopkins et al.i (j2007i ) involving all three models at i = 20.2 (dash- 
dotted lines) and inefficient (black solid) feedback. 

ies detected at each of 250, 350 and 500 /im from the Her- 
schel Muhi-tiercd Extragalactic Survey (HcrMES) cen- 
tered on the Bootes field, against samples of galaxies de- 
tected at optical and ncar-IR wavelengths from the Sloan 
Digital Sky Survey (SDSS), the NOAO Deep Wide Field 
Survey (NDWFS), and the Spitzer Deep Wide Field Sur- 



vey (SDWFS). 

We create optical and near-IR galaxy samples based on 
their photometric or spectroscopic redshift distributions 
and test the accuracy of these redshift distributions 
with similar galaxy samples defined via catalogs of the 
Cosmological Evolution Survey (COSMOS). We fit the 
clustering auto and cross-correlations of SPIRE and 
optical/IR galaxy samples at angular scales of 0.1 to 
1°, where clustering of each of the galaxy samples is ex- 
pected to be linear, with the amplitude determined by a 
bias factor together with the redshift distribution of the 
sources. We make use of a Markov Chain Monte Carlo 
(MCMC) method to sample N{z) at five nodes in the 
range < z < 4, as well as the bias factors. The SPIRE- 
selected sub-mm galaxy bias factor is found to vary with 



We 



redshift according to b{z) = 1.0Io;^(l + z)'-''- 
find clear evidence of evolving redshift distributions as 
the wavelength increases from 250 //m to 500 ^m, with 
the 250 /im band containing the largest number of low 
redshift sources. We also compare the measured redshift 
distribution to model predictions in the literature, and 
find an excess of sources in the highest redshift bin when 
compared to the model prediction from Bethermin et al. 
(2010), although in general our results agree with both 
predictions from the literature. With subsequent ob- 
servations in more fields, this analysis could potentially 
be carried out again - incorporating more data in this 
analysis would reduce the size of the errors and more 
fully constrain the N{z) of these sub-mm galaxies. 
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APPENDIX 
THE CORRELATION COEFFICIENT OF THE N(Z) 
In Table [3] we show the correlation coefficient of the N(z) at five pivot rcdshifts (A'i) for three SPIRE bands, which 



is derived from our Marcov chains. The definition is given by 

cov(Ni,m 
r = ^ —. 

Here cov(A^i, A^j), crjVi and ctat. are the covariance matrix and standard deviations for Ni and A'j, respectively. 



(Al) 



TABLE 3 

The correlation coefficient of the N(Zi) at five pivot redshifts for three SPIRE bands. 
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